In the DOTPLOT folder there is a second program called DTPLT_ED.PRG. This program enables you to change the defaults of the DOTPLOT program itself. On running the editor you will get a screen looking like:
You will recognize a multi-button panel if you see one by now, so I guess this one will not give you too much trouble. The top-left block is for changing the default extensions of DNA and protein files, respectively. If one of these two buttons is clicked, the program enters the editor mode and you are given the opportunity to edit the old extensions. Likewise you can edit the window and score values of the three standard protein score tables; these are all in the elongated box on the left side of the screen. In the middle are two small boxes: one to change the window and score of DNA comparison, and the other the Quit option, to stop and leave the program.
The whole of the right of the screen is dedicated to the three extra score tables. Of these the names, windows, scores and comments all can be edited in the same way as with the other tables. An additional option is ``Values''; when this button is clicked upon a new picture will fill your screen:
As you can see, it lists the complete set of values of the score table, and this will fill almost half of your screen with small, all but invisible lettering. Also present are an exit to return to your previous screen and an edit box. When you enter the second screen, the edit box reports on the Alanine-Alanine couple, and if you look at the table above you will see that the little box at the crossing of the A-row and the A-column is indeed in reversed video. To change a value, just type in the new one and it will replace the old one. The corresponding box will switch to normal video and the next one will be activated. If you don't want to change all values but only some, there are three ways to activate the box of your choice:
Press <RETURN> and keep it pressed until you reach the right box.
Use the arrows on your keyboard.
Simply use the mouse to click in the desired box to activate it.
When you are finished, click on ``Exit'' and then ``Quit''; all changes will be saved and DOTPLOT will be able to run with new sets of defaults and/or a new table.
INFPG5.TXT
To explain this option I will have to tell you something more about the various methods of comparing proteins and their amino acids.
When DNA files are compared the scoring is fairly simple: for every identical base the score is incremented. This method is also available for proteins, but there are other options as well. Some amino acids are chemically more related than others; Glycine (R-H) is nearer to Alanine (R-CH3) than to Cysteine (R-CH2-SH). This fact can be expressed either as a fraction or as equality within a group.
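The identity scoring described above can be sketched in a few lines of Python. This is only an illustration and not the code of DOTPLOT itself; the function name identity_score and the requirement that both stretches have the same length are assumptions made for the example.

```python
def identity_score(stretch_a, stretch_b):
    """Count the positions at which two equal-length stretches are identical.

    Hypothetical helper for illustration; not taken from DOTPLOT.
    """
    if len(stretch_a) != len(stretch_b):
        raise ValueError("both stretches must have the same length")
    # The score is incremented once for every identical base (or residue).
    return sum(1 for a, b in zip(stretch_a.upper(), stretch_b.upper()) if a == b)
```

For example, identity_score("ACGT", "ACGA") counts the three leading positions that match.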
Another approach is to score for evolutionary relatedness. This means two processes have to be considered and expressed in a number. First, the chance of a certain codon mutating into another has to be calculated, and secondly, the fitness of this mutation has to be assessed. Both the chance and the fitness have to be expressed by a single number.
Both the chemical and the evolutionary method have been incorporated into DOTPLOT.
The chemical scoring table is called ``JIMENEZ'' after the man who first described it (1). It does not score individual amino acids but divides them into groups.
The groups are:
PAGST
QNEDBZ
HKR
All amino acids within the same group score equal (=1) against each other; amino acids from different groups score 0.
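As a minimal sketch, group scoring of this kind could look as follows in Python. The three groups are the ones listed above; the names GROUPS, GROUP_OF and group_score are hypothetical, and the treatment of residues that appear in none of the listed groups (they score 0 here, even against themselves) is an assumption of this example, not something stated for DOTPLOT.

```python
# The groups exactly as listed in the text above.
GROUPS = ("PAGST", "QNEDBZ", "HKR")

# Map every listed residue letter to the index of its group.
GROUP_OF = {residue: index
            for index, group in enumerate(GROUPS)
            for residue in group}

def group_score(a, b):
    """Return 1 if both residues belong to the same listed group, else 0.

    Assumption: residues outside the listed groups always score 0.
    """
    group_a = GROUP_OF.get(a.upper())
    group_b = GROUP_OF.get(b.upper())
    return 1 if group_a is not None and group_a == group_b else 0
```

With this sketch, Alanine against Glycine scores 1 (both are in PAGST), while Alanine against Lysine scores 0.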
The evolutionary approach is represented by ``DAYHOFF'', again named for an important contributor to this work (2). This is a completely individual scoring table: the relatedness of every amino acid with every other amino acid is expressed as a number between 0 and 2.73.
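An individual scoring table of this kind amounts to a lookup per pair of amino acids. The Python sketch below shows one possible way to hold such a table; the numbers in it are placeholders and are NOT the real DAYHOFF values, and the names PLACEHOLDER_TABLE and pair_score, the symmetric lookup (A-G equals G-A) and the default of 0.0 for missing pairs are all assumptions made for the illustration.

```python
# Placeholder values only; these are NOT the real DAYHOFF entries.
# Keys are unordered pairs, so A-G and G-A share one entry (assumed symmetry).
PLACEHOLDER_TABLE = {
    frozenset("A"): 1.0,    # the A-A couple
    frozenset("AG"): 0.5,   # the A-G couple, also used for G-A
    frozenset("G"): 1.0,    # the G-G couple
}

def pair_score(table, a, b, default=0.0):
    """Look up the score of one amino acid couple in a pair table.

    Couples missing from the table get the (assumed) default score.
    """
    return table.get(frozenset((a.upper(), b.upper())), default)
```

A user-defined table, such as one of the three extra tables, could be represented the same way by filling in its own values.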
There are three more tables available in DOTPLOT, the contents of which, as well as their names, defaults and comments can all be changed. So if you feel you have developed an improved scoring system you can change one of these tables to fit, complete with an appropriate name and defaults. If you choose to use a scoring table the next step of DOTPLOT is obvious.
No part of this publication may be copied in any form without the citation of the original article: Karreman, C (1992) A dotplot program for the Atari ST, capable of assaying DNA and protein sequences. CABIOS
No. 1
DOTPLOT is a program for comparing two protein or DNA sequences and for doing so quickly and interactively on the Atari ST.
Contents
Installation of DOTPLOT.
Running DOTPLOT.
DNA/Protein
Scoring-tables
Length
Reverse
Options.
Zoom-in
Show Homology
Borders
Shift
Expand
Change
Conditions
Parameters
Output
Another run
The DOTPLOT editor.
Principle of DOTPLOT.
The files on the disk.
Formats.
References.
Installation of DOTPLOT.
For the installation of DOTPLOT you will need at least an Atari 1024 ST with a black and white monitor and one single-sided disk drive. Atari computers with more RAM, e.g. the Mega ST 2 or Mega ST 4, are also fine; a hard disk will speed up the loading and saving of your files considerably but is not necessary. A color monitor is not compatible with DOTPLOT; you need a high-resolution screen.
DOTPLOT comes on one floppy disk that contains a total of four subdirectories (folders). The program itself, the program editor and all data files are in the DOTPLOT folder; the data files are all in their own folder called DATA.
To install DOTPLOT you will have to copy at least the DOTPLOT folder and its contents to the disk you are planning to use for running DOTPLOT. This can be another floppy disk but preferably it will be a logical sector of your hard disk. Since DOTPLOT will be writing to the disk as well as reading from it during the run, the disk cannot be write-protected. This will make the disk very sensitive to any viruses that are in your system. So to install DOTPLOT, switch off your system (hard reset) and then, before you run anything else, make a copy of your DOTPLOT disk. Keep the original disk write-protected at all times.
The other two folders that are on the disk are called DNA and PROTEIN. They contain two DNA and two protein sequence files respectively. When compared with each other by DOTPLOT they will reveal stretches of homology; this will give you some idea about the desirable default values and graphic output.
Although the latter two folders are not absolutely necessary to run the program, it is advisable to install them on the same logical sector as your DOTPLOT folder. The program will look for them, and the files contained in them, first.
If you have copied all the folders you are ready to run DOTPLOT.
Running DOTPLOT.
To run DOTPLOT, start the program by double-clicking on DOTPLOT.PRG; this will load the program and start its execution. The first thing you will see is the same picture as on the front of these instructions. Although it is a very nice picture and you can probably watch it for hours, a single click on one of the mouse buttons or pressing any key will stop the logo. Subsequently the program will ask you if you want any information.
As is customary for most programs, the thick-lined box is the default option: if <RETURN> is pressed this option is automatically selected. You can select "Yes" by clicking the left mouse button after placing the mouse arrow in the "Yes"-box. If you want to have more information at this point, here is your opportunity. All the information, of course, is also contained in this set of instructions. The built-in help files are accessible by looking up the item of interest on the INDEX-page (page 2) and selecting the corresponding page by typing the number at the prompt. It is also possible to browse by pressing <RETURN>. After you select "QUIT" the information mode is left behind and you will proceed to the next option. This is of course the same one you would have encountered if you hadn't opted for information in the first place.
The first question of DOTPLOT.PRG.
The next question of DOTPLOT will probably be of more interest to you as you are now getting impatient for some serious DOTPLOTting. Here your first real choice is made; either for DNA (or RNA see pages 8 and 9) or for proteins.
The INDEX page of the built-in information.
The DNA/Protein options.
As can be seen from the thick-lined box, <RETURN> will get you the protein option. Since this is the most used (right, that's why it is the default), this leaflet will first cover the events following your pressing <RETURN> or "clicking" the right-hand box. For DNA go directly to page 7.
As soon as you have selected proteins you get your second choice to make.